Given a natural-language description of a user's demands, the NL2Code task aims to generate code that addresses those demands. This is a critical but challenging task that mirrors the capabilities of AI-powered programming. The NL2Code task is inherently versatile, diverse, and complex. For example, a demand can be described in different languages, in different formats, and at different levels of granularity. This inspired us to conduct this survey of NL2Code. In this survey, we focus on how neural networks (NNs) solve NL2Code. We first propose a comprehensive framework that is able to cover all studies in this field. Then, we parse the existing studies in depth into this framework. We create a website to record the parsing results, which tracks existing and recent NL2Code progress. In addition, we summarize the current challenges of NL2Code as well as its future directions. We hope that this survey can foster the evolution of this field.
In recent years, vision-centric perception has flourished in various autonomous driving tasks, including 3D detection, semantic map construction, motion forecasting, and depth estimation. Nevertheless, the latency of vision-centric approaches is too high for practical deployment (e.g., most camera-based 3D detectors have a runtime greater than 300ms). To bridge the gap between ideal research and real-world applications, it is necessary to quantify the trade-off between performance and efficiency. Traditionally, autonomous-driving perception benchmarks perform offline evaluation, neglecting the inference time delay. To mitigate the problem, we propose the Autonomous-driving StreAming Perception (ASAP) benchmark, which is the first benchmark to evaluate the online performance of vision-centric perception in autonomous driving. On the basis of the 2Hz annotated nuScenes dataset, we first propose an annotation-extending pipeline to generate high-frame-rate labels for the 12Hz raw images. With practical deployment in mind, we further construct the Streaming Perception Under constRained-computation (SPUR) evaluation protocol, in which the 12Hz inputs are used for streaming evaluation under the constraints of different computational resources. In the ASAP benchmark, comprehensive experimental results reveal that the model ranking changes under different constraints, suggesting that model latency and computation budget should be treated as design choices when optimizing for practical deployment. To facilitate further research, we establish baselines for camera-based streaming 3D detection, which consistently enhance the streaming performance across various hardware. ASAP project page: https://github.com/JeffWang987/ASAP.
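The idea of streaming evaluation can be illustrated with a minimal sketch. It is not the SPUR protocol itself: the function name `simulate_streaming`, the fixed per-frame latency, and the "always pick the newest available frame" policy are assumptions made for illustration. Times are integer ticks (for example, one tick per 12Hz frame interval) to avoid floating-point comparisons.

```python
from bisect import bisect_right


def simulate_streaming(frame_times, latency, eval_times):
    """Return, for each evaluation time, the index of the input frame whose
    prediction is the most recent one already finished (None if none is ready).

    Assumptions (illustrative only): a single model instance, a constant
    latency per frame, and a policy of processing the newest available frame
    whenever the model becomes idle. frame_times must be sorted ascending.
    """
    finished = []  # list of (finish_time, frame_index), in finishing order
    free_at = 0    # time at which the model becomes idle again
    i, n = 0, len(frame_times)
    while i < n:
        start = max(free_at, frame_times[i])
        # Skip stale frames: jump to the newest frame that has arrived by `start`.
        j = i
        while j + 1 < n and frame_times[j + 1] <= start:
            j += 1
        finish = start + latency
        finished.append((finish, j))
        free_at = finish
        i = j + 1

    finish_times = [f for f, _ in finished]
    matches = []
    for t in eval_times:
        k = bisect_right(finish_times, t) - 1  # last prediction finished by time t
        matches.append(finished[k][1] if k >= 0 else None)
    return matches


# Nine frames at one-tick spacing, with a latency of four ticks per frame.
ticks = list(range(9))
print(simulate_streaming(ticks, 4, ticks))
```

With a latency of four ticks, the prediction scored at tick 8 comes from frame 4, so a slower model is evaluated against a scene that has already moved on. This is the effect that offline evaluation ignores.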
Text summarization is recognised as one of the downstream NLP tasks and has been extensively investigated in recent years. It can help people quickly absorb information from the Internet, including news articles, social posts, videos, etc. Most existing research attempts to develop summarization models that produce better output. However, most existing models have evident limitations, including unfaithfulness and factual errors. In this paper, we propose a novel model, named Knowledge-aware Abstractive Text Summarization, which leverages the advantages offered by a Knowledge Graph to enhance the standard Seq2Seq model. In addition, Knowledge Graph triplets are extracted from the source text and utilised to provide keywords with relational information, producing coherent and factually correct summaries. We conduct extensive experiments on real-world data sets. The results reveal that the proposed framework can effectively utilise the information from the Knowledge Graph and significantly reduce the factual errors in the summary.
A key challenge in federated learning (FL) is the statistical heterogeneity that impairs the generalization of the global model on each client. To address this, we propose Federated learning with Adaptive Local Aggregation (FedALA), a method that captures the desired information in the global model for client models in personalized FL. The key component of FedALA is an Adaptive Local Aggregation (ALA) module, which adaptively aggregates the downloaded global model and the local model towards the local objective on each client to initialize the local model before training in each iteration. To evaluate the effectiveness of FedALA, we conduct extensive experiments with five benchmark datasets in the computer vision and natural language processing domains. FedALA outperforms eleven state-of-the-art baselines by up to 3.27% in test accuracy. Furthermore, we apply the ALA module to other federated learning methods and achieve up to 24.19% improvement in test accuracy.
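A minimal sketch of what "adaptively aggregating the global and local models towards the local objective" could look like follows. The abstract does not give the update rule, so the element-wise weights, the clipping to [0, 1], the gradient step on the weights, and the names `aggregate` and `ala_step` are all assumptions made for illustration.

```python
def aggregate(local, global_, weights):
    """Element-wise blend: local + (global - local) * w, with w clipped to [0, 1].

    w = 0 keeps the local parameter, w = 1 adopts the global one.
    (Assumed form; the abstract does not specify it.)
    """
    return [l + (g - l) * min(1.0, max(0.0, w))
            for l, g, w in zip(local, global_, weights)]


def ala_step(local, global_, weights, grad_fn, lr):
    """One gradient step on the aggregation weights towards the local objective.

    grad_fn(theta) returns dLoss/dtheta for the client's local loss.
    By the chain rule, dLoss/dw = dLoss/dtheta * (global - local).
    """
    theta = aggregate(local, global_, weights)
    grads = grad_fn(theta)
    new_weights = [w - lr * gt * (g - l)
                   for w, gt, l, g in zip(weights, grads, local, global_)]
    return [min(1.0, max(0.0, w)) for w in new_weights]


# Toy local objective (hypothetical): 0.5 * (theta - target)^2 per parameter.
target = [0.5]
grad_fn = lambda theta: [t - y for t, y in zip(theta, target)]

local, global_, weights = [0.0], [2.0], [1.0]  # start from the global model
for _ in range(100):
    weights = ala_step(local, global_, weights, grad_fn, lr=0.1)

init = aggregate(local, global_, weights)  # initial local model for this round
print(weights, init)
```

In this toy case the weight settles where the blended parameter matches the client's own optimum, so the client takes only as much of the global model as its local objective supports.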
Adding perturbations using auxiliary gradient information and discarding existing details of the benign images are two common approaches for generating adversarial examples. Though visual imperceptibility is the desired property of adversarial examples, conventional adversarial attacks still generate traceable adversarial perturbations. In this paper, we introduce a novel Adversarial Attack via Invertible Neural Networks (AdvINN) method to produce robust and imperceptible adversarial examples. Specifically, AdvINN takes full advantage of the information preservation property of Invertible Neural Networks and thereby generates adversarial examples by simultaneously adding class-specific semantic information of the target class and dropping discriminant information of the original class. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet-1K demonstrate that the proposed AdvINN method produces less perceptible adversarial images than state-of-the-art methods and yields more robust adversarial examples with high confidence compared to other adversarial attacks.
To test whether artificial intelligence can create qualified classical poetry as humans do, the author proposes a study of Chinese classical poetry generation based on pre-trained models. This paper mainly uses BART and other pre-trained models, proposes FS2TEXT and RR2TEXT to generate metrical poetry and even poetry in a specific style, and addresses the problem that the relevance of the generated poetry to the user's writing intention gradually decreases. To test the model's results, the author selected works by ancient poets, combined them with works produced by the BART-based poetry model, and developed a set of AI poetry Turing tests, which were reviewed by a group of poets and poetry-writing researchers. There were more than 600 participants, and the final results showed that high-level poetry lovers cannot distinguish between AI works and human works, which indicates that the model's output is not significantly different from human writing. The poetry generation model studied by the author produces works that cannot be distinguished from those of advanced scholars. The number of modern Chinese poets has reached 5 million. However, many modern Chinese poets lack language ability and skills as a result of their childhood learning, and many have no creative inspiration; the author's model can help them. They can consult the model when choosing words and phrases, and they can build on poems they already have to write their own. The importance of poetry lies in the author's thoughts and reflections. It does not matter how good AI poetry is. What matters is that people see it and are inspired by it.
The explosion of e-commerce has created a need for processing and analysis of product titles, such as entity typing in product titles. However, the rapid activity in e-commerce has led to the rapid emergence of new entities, which are difficult for general entity typing to handle. Besides, product titles in e-commerce have language styles very different from those of text data in the general domain. To handle new entities in product titles and address the special language styles of product titles in the e-commerce domain, we propose a textual entailment model with continuous prompt tuning based hypotheses and fusion embeddings for e-commerce entity typing. First, we reformulate the entity typing task as a textual entailment problem to handle new entities that are not present during training. Second, we design a model that automatically generates textual entailment hypotheses using a continuous prompt tuning method, which can generate better hypotheses without manual design. Third, we utilize fusion embeddings of BERT and CharacterBERT embeddings with a two-layer MLP classifier to address the difference between the language styles of e-commerce product titles and those of the general domain. To analyze the effect of each contribution, we compare the performance of the entity typing and textual entailment models, and conduct ablation studies on continuous prompt tuning and fusion embeddings. We also evaluate the impact of different prompt template initializations for continuous prompt tuning. We show that our proposed model improves the average F1 score by around 2% compared to the baseline BERT entity typing model.
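Product bundling is a common selling mechanism used in online retailing. To set profitable bundle prices, the seller needs to learn consumer preferences from transaction data. When customers purchase bundles or multiple products, classical methods such as discrete choice models cannot be used to estimate customers' valuations. In this paper, we propose an approach that uses bundle sales data to learn consumers' valuations of the products. The approach reduces the task to an estimation problem in which the samples are censored by polyhedral regions. Using the EM algorithm and Monte Carlo simulation, our approach can recover the distribution of consumers' valuations. The framework allows for unobserved no-purchases and clustered market segments. We provide theoretical results on the identifiability of the probability model and the convergence of the EM algorithm. The performance of the approach is also demonstrated numerically.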
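Recent years have witnessed a trend of using context frames to boost the performance of object detection in the form of video object detection. Existing methods usually aggregate features in a single step to enhance them. However, these methods usually lack spatial information from neighboring frames and suffer from insufficient feature aggregation. To address these issues, we take a progressive approach to introducing both temporal and spatial information for an integrated enhancement. The temporal information is introduced by the Temporal Feature Aggregation Model (TFAM), which applies an attention mechanism between the context frames and the target frame (i.e., the frame to be detected). Meanwhile, we employ a Spatial Transition Awareness Model (StAM) to convey the location transition information between each context frame and the target frame. Built upon the Transformer-based detector DETR, our PTSeformer also follows an end-to-end fashion to avoid heavy post-processing procedures, while achieving 88.1% mAP on the ImageNet VID dataset. The code is available at https://github.com/hon-wong/ptseformer.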
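We introduce the design and baseline results of a new challenge in the ChaLearn meta-learning series, accepted at NeurIPS'22, focusing on "cross-domain" meta-learning. Meta-learning aims to leverage experience gained from previous tasks to solve new tasks efficiently (i.e., with better performance, little training data, and/or modest computational resources). While previous challenges in the series focused on within-domain few-shot learning problems, with the aim of efficiently learning N-way k-shot tasks (i.e., N-class classification problems with k training examples per class), this competition challenges participants to solve "any-way" and "any-shot" problems drawn from various domains (healthcare, ecology, biology, manufacturing, and others), chosen for their humanitarian and societal impact. To that end, we created Meta-Album, a meta-dataset of 40 image classification datasets from 10 domains, from which we carve out tasks with any number of "ways" (within the range 2-20) and any number of "shots" (within the range 1-20). The competition is run with code submission and is fully blind-tested on the CodaLab challenge platform. The winners' code will be open-sourced, enabling the deployment of automated machine learning solutions for few-shot image classification across several domains.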
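To make the "any-way, any-shot" setting concrete, here is a minimal sketch of how such a task could be drawn from a labelled dataset. It is not the competition's task generator: the function name `sample_task`, the dictionary-based dataset layout, and the choice of a single shot count shared by all classes are assumptions made for illustration. Only the ranges (2-20 ways, 1-20 shots) come from the text above.

```python
import random


def sample_task(dataset, rng, way_range=(2, 20), shot_range=(1, 20)):
    """Draw one any-way any-shot task.

    dataset: dict mapping class label -> list of examples (hypothetical layout);
             it must contain at least way_range[0] classes.
    Returns (n_way, k_shot, support), where support maps each sampled class
    to k_shot examples (fewer if the class has fewer examples).
    """
    max_way = min(way_range[1], len(dataset))
    n_way = rng.randint(way_range[0], max_way)           # number of classes
    k_shot = rng.randint(shot_range[0], shot_range[1])   # examples per class
    classes = rng.sample(sorted(dataset), n_way)
    support = {
        c: rng.sample(dataset[c], min(k_shot, len(dataset[c])))
        for c in classes
    }
    return n_way, k_shot, support


# Toy dataset: 25 classes with 30 placeholder examples each.
toy = {f"class_{i}": [f"img_{i}_{j}" for j in range(30)] for i in range(25)}
n_way, k_shot, support = sample_task(toy, random.Random(0))
print(n_way, k_shot, len(support))
```

Because both the number of classes and the number of examples per class vary from task to task, a solution cannot hard-code a fixed output layer or assume a fixed support-set size.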